Oligopeptides' frequencies in the classification of proteins' primary structures
Abstract
This paper reports an approach to classifying proteins’ primary structures that combines the Self-Organizing Maps algorithm with a numerical coding of the amino acids based on their physicochemical properties. Hydrophobicity, volume, surface area, hydrophilicity, bulkiness, refractivity and polarity were subjected to a Principal Component Analysis, and the first two principal components, which explain 84.8% of the total observed variability, were used to cluster the amino acids into 4 or 5 classes with a k-means algorithm. This yields an economical representation of the primary structures that, when the input vectors for the Self-Organizing Maps algorithm are constructed, allows tri- and tetrapeptide frequency matrices to be considered with minimal computational overhead. Compared with previously explored conditions, namely symbolic coding of amino acids and dipeptide frequencies, no significant improvement was observed in the classification of 69 c-type cytochromes, which are characterized by a high degree of structural and functional similarity, whereas a substantial improvement occurred for a data set that includes quite heterogeneous primary structures.
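The recoding step described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4-class grouping below is hypothetical (the abstract does not list the classes obtained from the PCA and k-means steps), the class labels a to d and the function names are invented for the example, and the input is an arbitrary toy string. The sketch shows why a reduced alphabet keeps the input vectors small: with 4 classes a tetrapeptide frequency vector has 4^4 = 256 entries, against 20^4 = 160000 with symbolic coding of the 20 amino acids.

```python
from itertools import product

# HYPOTHETICAL grouping of the 20 amino acids into 4 classes, for illustration
# only. The paper derives its classes from PCA + k-means on physicochemical
# properties; those assignments are not given in the abstract.
HYPOTHETICAL_CLASSES = {
    "a": "AVLIMFWC",
    "b": "GPSTYNQH",
    "c": "DE",
    "d": "KR",
}

# Invert the grouping: one-letter residue code -> class label.
RESIDUE_TO_CLASS = {
    residue: label
    for label, members in HYPOTHETICAL_CLASSES.items()
    for residue in members
}


def recode(sequence):
    """Rewrite a primary structure over the reduced class alphabet."""
    return "".join(RESIDUE_TO_CLASS[residue] for residue in sequence)


def kmer_frequencies(coded, k, alphabet):
    """Return relative frequencies of all overlapping k-mers, in a fixed order.

    The vector has len(alphabet) ** k entries, one per possible k-mer.
    """
    kmers = ["".join(p) for p in product(sorted(alphabet), repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(coded) - k + 1
    if windows <= 0:
        return [0.0] * len(kmers)
    for i in range(windows):
        counts[coded[i:i + k]] += 1
    return [counts[kmer] / windows for kmer in kmers]


toy_sequence = "GDVEKGKKIF"  # arbitrary toy input, not taken from the paper
coded = recode(toy_sequence)
tri = kmer_frequencies(coded, 3, HYPOTHETICAL_CLASSES)
tetra = kmer_frequencies(coded, 4, HYPOTHETICAL_CLASSES)

print(coded)
print(len(tri), len(tetra))  # 4**3 and 4**4 entries
print(20 ** 3, 20 ** 4)      # sizes under symbolic coding of 20 amino acids
```

Vectors built this way have the same length for every sequence, so they could serve as inputs to a Self-Organizing Map; the map training itself is not shown here.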
Similar resources
Fault location and classification in non-homogeneous transmission line utilizing breaker transients
In this paper, a single-ended fault location method is presented based on a circuit breaker operation using the frequencies of traveling waves. The proposed method receives the required data from voltage traveling waves with the aid of Fast Fourier Transform (FFT) and Wavelet Transform. Then, the Artificial Neural Network (ANN) identifies fault type and determines its location. In order to eval...
Absolute Net Charge and the Biological Activity of Oligopeptides
Sequences of human proteins are frequently prepared as synthetic oligopeptides to assess their functional ability to act as compounds modulating pathways involving the parent protein. Our objective was to analyze a set of oligopeptides, to determine if their solubility or activity correlated with features of their primary sequence, or with features of properties inferred from three-dimensional ...
The EROP-Moscow oligopeptide database
Natural oligopeptides may regulate nearly all vital processes. To date, the chemical structures of nearly 6000 oligopeptides have been identified from >1000 organisms representing all the biological kingdoms. We have compiled the known physical, chemical and biological properties of these oligopeptides--whether synthesized on ribosomes or by non-ribosomal enzymes--and have constructed an intern...
Functional Annotation of Two Hypothetical Proteins Reveals Valuable Proteins Involved in Response to Salinity: An in silico Approach
Through the exponential development in the specification of sequences and structures of proteins by genome sequencing and structural genomics approaches, there is a growing demand for valid bioinformatics methods to define these proteins function. In this study, our objective is to identify the function of unknown proteins from UCB-1 pistachio rootstock and specify their class...
Calculation of Buckling Load and Eigen Frequencies for Planar Truss Structures with Multi-Symmetry
In this paper, the region in which the structural system is situated is divided into four subregions, namely upper, lower, left and right subregions. The stiffness matrix of the entire system is then formed and using the existing direct symmetry and reverse symmetry, the relationships between the entries of the matrix are established. Examples are included to illustrate the steps of the method.
Publication date: 1998